NSF PAR Search | NSF Public Access Repository

Multilevel Diffusion: Infinite Dimensional Score-Based Diffusion Models for Image Generation

https://doi.org/10.1137/23M1614092

Hagemann, Paul; Mildenberger, Sophie; Ruthotto, Lars; Steidl, Gabriele; Yang, Nicole Tianjiao (September 2025, SIAM Journal on Mathematics of Data Science)

Free, publicly-accessible full text available September 30, 2026
Efficient Neural Network Approaches for Conditional Optimal Transport with Applications in Bayesian Inference

https://doi.org/10.1137/24M1678659

Wang, Zheyu Oliver; Baptista, Ricardo; Marzouk, Youssef; Ruthotto, Lars; Verma, Deepanshu (August 2025, SIAM Journal on Scientific Computing)

Free, publicly-accessible full text available August 31, 2026
Neural network approaches for parameterized optimal control

https://doi.org/10.3934/fods.2024042

Verma, Deepanshu; Winovich, Nick; Ruthotto, Lars; van Bloemen Waanders, Bart (January 2025, Foundations of Data Science)

We consider numerical approaches for deterministic, finite-dimensional optimal control problems whose dynamics depend on unknown or uncertain parameters. We seek to amortize the solution over a set of relevant parameters in an offline stage to enable rapid decision-making and be able to react to changes in the parameter in the online stage. To tackle the curse of dimensionality arising when the state and/or parameter are high-dimensional, we represent the policy using neural networks. We compare two training paradigms: First, our model-based approach leverages the dynamics and definition of the objective function to learn the value function of the parameterized optimal control problem and obtain the policy using a feedback form. Second, we use actor-critic reinforcement learning to approximate the policy in a data-driven way. Using an example involving a two-dimensional convection-diffusion equation, which features high-dimensional state and parameter spaces, we investigate the accuracy and efficiency of both training paradigms. While both paradigms lead to a reasonable approximation of the policy, the model-based approach is more accurate and considerably reduces the number of PDE solves.
Full Text Available
Alternating Minimization for Regression with Tropical Rational Functions

Dunbar, Alex; Ruthotto, Lars (June 2024, Algebraic Statistics)

We propose an alternating minimization heuristic for regression over the space of tropical rational functions with fixed exponents. The method alternates between fitting the numerator and denominator terms via tropical polynomial regression, which is known to admit a closed form solution. We demonstrate the behavior of the alternating minimization method experimentally. Experiments demonstrate that the heuristic provides a reasonable approximation of the input data. Our work is motivated by applications to ReLU neural networks, a popular class of network architectures in the machine learning community which are closely related to tropical rational functions.
Full Text Available
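
The abstract above relies on tropical polynomial regression having a closed-form solution. A minimal sketch of one standard closed form follows, assuming a sup-norm fit of a max-plus polynomial with fixed exponents: take the largest coefficients whose polynomial stays below the data, then shift them by half the largest residual. The function names and the sample data are hypothetical, and the formulation used in the paper may differ.

```python
def dot(w, x):
    """Ordinary inner product of an exponent vector and a data point."""
    return sum(wi * xi for wi, xi in zip(w, x))

def tropical_eval(coeffs, exponents, x):
    """Evaluate the max-plus polynomial max_i (coeffs[i] + <exponents[i], x>)."""
    return max(a + dot(w, x) for a, w in zip(coeffs, exponents))

def fit_tropical_polynomial(xs, ys, exponents):
    """Sup-norm fit of the coefficients for fixed exponents.

    Step 1: greatest coefficients whose polynomial stays below the data.
    Step 2: shift by half the largest residual to balance the error.
    """
    sub = [min(y - dot(w, x) for x, y in zip(xs, ys)) for w in exponents]
    gap = max(y - tropical_eval(sub, exponents, x) for x, y in zip(xs, ys))
    coeffs = [a + gap / 2 for a in sub]
    return coeffs, gap / 2

# Hypothetical data: samples of max(0, x) with one perturbed value.
xs = [(-2.0,), (-1.0,), (0.0,), (1.0,), (2.0,)]
ys = [0.2, 0.0, 0.0, 1.0, 2.0]
exponents = [(0.0,), (1.0,)]
coeffs, sup_error = fit_tropical_polynomial(xs, ys, exponents)
```

An alternating scheme for tropical rational functions would call such a fit in turn for the numerator and the denominator; that outer loop is not shown here.
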
Differential Equations for Continuous-Time Deep Learning

https://doi.org/10.1090/noti2930

Ruthotto, Lars (May 2024, Notices of the American Mathematical Society)
Malek-Madani, Reza (Ed.)
This short, self-contained article seeks to introduce and survey continuous-time deep learning approaches that are based on neural ordinary differential equations (neural ODEs). It primarily targets readers familiar with ordinary and partial differential equations and their analysis who are curious to see their role in machine learning. Using three examples from machine learning and applied mathematics, we will see how neural ODEs can provide new insights into deep learning and a foundation for more efficient algorithms.
Full Text Available
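
As background for the neural ODE viewpoint surveyed above, the sketch below shows the well-known link between residual networks and ODE discretization: a residual layer x_{k+1} = x_k + h f(x_k, theta_k) is one forward Euler step for dx/dt = f(x, theta(t)). The tanh layer, the function names, and the test values are illustrative assumptions, not taken from the article.

```python
import math

def layer(x, weights, bias):
    """Hypothetical layer function f(x, theta) = tanh(W x + b)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def residual_network(x, thetas, h):
    """Apply x_{k+1} = x_k + h * f(x_k, theta_k) for each layer k.

    This is forward Euler with step size h applied to dx/dt = f(x, theta(t)).
    """
    for weights, bias in thetas:
        update = layer(x, weights, bias)
        x = [xi + h * ui for xi, ui in zip(x, update)]
    return x

def forward_euler(f, x, t_final, n_steps):
    """Generic forward Euler for a scalar autonomous ODE dx/dt = f(x)."""
    h = t_final / n_steps
    for _ in range(n_steps):
        x = x + h * f(x)
    return x

# Zero weights give f = tanh(0) = 0, so the features pass through unchanged.
zero_theta = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
unchanged = residual_network([1.0, -2.0], [zero_theta] * 4, h=0.25)

# Sanity check of the integrator on dx/dt = -x, whose solution is exp(-t).
approx = forward_euler(lambda x: -x, 1.0, t_final=1.0, n_steps=1000)
```

Viewing depth as time in this way is what allows other integrators or adaptive step sizes to be swapped in for the fixed-step loop.
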
PyHySCO: GPU-enabled susceptibility artifact distortion correction in seconds

https://doi.org/10.3389/fnins.2024.1406821

Julian, Abigail; Ruthotto, Lars (May 2024, Frontiers in Neuroscience)

Over the past decade, reversed gradient polarity (RGP) methods have become a popular approach for correcting susceptibility artifacts in echo-planar imaging (EPI). Although several post-processing tools for RGP are available, their implementations do not fully leverage recent hardware, algorithmic, and computational advances, leading to correction times of several minutes per image volume. To enable 3D RGP correction in seconds, we introduce PyTorch Hyperelastic Susceptibility Correction (PyHySCO), a user-friendly EPI distortion correction tool implemented in PyTorch that enables multi-threading and efficient use of graphics processing units (GPUs). PyHySCO uses a time-tested physical distortion model and mathematical formulation and is, therefore, reliable without training. An algorithmic improvement in PyHySCO is its use of the one-dimensional distortion correction method by Chang and Fitzpatrick to initialize the non-linear optimization. PyHySCO is published under the GNU public license and can be used from the command line or its Python interface. Our extensive numerical validation using 3T and 7T data from the Human Connectome Project suggests that PyHySCO can achieve accuracy comparable to that of leading RGP tools at a fraction of the cost. We also validate the new initialization scheme, compare different optimization algorithms, and test the algorithm on different hardware and arithmetic precisions.
Full Text Available
A Neural Network Approach for Stochastic Optimal Control

Li, Xingjian; Verma, Deepanshu; Ruthotto, Lars (June 2024, SIAM Journal on Scientific Computing)

We present a neural network approach for approximating the value function of high-dimensional stochastic control problems. Our training process simultaneously updates our value function estimate and identifies the part of the state space likely to be visited by optimal trajectories. Our approach leverages insights from optimal control theory and the fundamental relation between semi-linear parabolic partial differential equations and forward-backward stochastic differential equations. To focus the sampling on relevant states during neural network training, we use the stochastic Pontryagin maximum principle (PMP) to obtain the optimal controls for the current value function estimate. By design, our approach coincides with the method of characteristics for the non-viscous Hamilton-Jacobi-Bellman equation arising in deterministic control problems. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the HJB equations along the sampled trajectories. Importantly, training is unsupervised in that it does not require solutions of the control problem. Our numerical experiments highlight our scheme’s ability to identify the relevant parts of the state space and produce meaningful value estimates. Using a two-dimensional model problem, we demonstrate the importance of the stochastic PMP to inform the sampling and compare to a finite element approach. With a nonlinear control affine quadcopter example, we illustrate that our approach can handle complicated dynamics. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution and, via a modification, we show the wider applicability of our scheme.
Full Text Available
LSEMINK: a modified Newton–Krylov method for Log-Sum-Exp minimization

https://doi.org/10.1553/etna_vol60s618

Kan, Kelvin; Nagy, James G; Ruthotto, Lars (January 2024, ETNA - Electronic Transactions on Numerical Analysis)

This paper introduces LSEMINK, an effective modified Newton–Krylov algorithm geared toward minimizing the log-sum-exp function for a linear model. Problems of this kind arise commonly, for example, in geometric programming and multinomial logistic regression. Although the log-sum-exp function is smooth and convex, standard line-search Newton-type methods can become inefficient because the quadratic approximation of the objective function can be unbounded from below. To circumvent this, LSEMINK modifies the Hessian by adding a shift in the row space of the linear model. We show that the shift renders the quadratic approximation to be bounded from below and that the overall scheme converges to a global minimizer under mild assumptions. Our convergence proof also shows that all iterates are in the row space of the linear model, which can be attractive when the model parameters do not have an intuitive meaning, as is common in machine learning. Since LSEMINK uses a Krylov subspace method to compute the search direction, it only requires matrix-vector products with the linear model, which is critical for large-scale problems. Our numerical experiments on image classification and geometric programming illustrate that LSEMINK considerably reduces the time-to-solution and increases the scalability compared to geometric programming and natural gradient descent approaches. It has significantly faster initial convergence than standard Newton–Krylov methods, which is particularly attractive in applications like machine learning. In addition, LSEMINK is more robust to ill-conditioning arising from the nonsmoothness of the problem. We share our MATLAB implementation at a GitHub repository (https://github.com/KelvinKan/LSEMINK).
Full Text Available
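
As background for the log-sum-exp objective discussed above, the following sketch evaluates the function, its gradient (the softmax), and its Hessian diag(p) - p p^T in a numerically stable way. It is not the LSEMINK algorithm, and the function names and sample input are hypothetical. The Hessian maps the all-ones vector to zero while the gradient does not vanish along that direction, which is one way to see how a Newton-type quadratic model of such a function can be unbounded from below.

```python
import math

def logsumexp(z):
    """Numerically stable log(sum(exp(z_i))) via subtraction of the maximum."""
    m = max(z)
    return m + math.log(sum(math.exp(v - m) for v in z))

def softmax(z):
    """Gradient of logsumexp with respect to z."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def lse_hessian(z):
    """Hessian of logsumexp: diag(p) - p p^T with p = softmax(z)."""
    p = softmax(z)
    n = len(p)
    return [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(n)]
            for i in range(n)]

# Large inputs would overflow a naive exp; the shifted version stays finite.
z = [1000.0, 1001.0, 999.0]
value = logsumexp(z)
p = softmax(z)
H = lse_hessian(z)

# Row sums equal H times the all-ones vector; they vanish up to rounding.
H_times_ones = [sum(row) for row in H]
# The gradient has slope sum(p) = 1 along the same direction.
slope_along_ones = sum(p)
```
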
Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics

Madondo, Malvern; Verma, Deepanshu; Ruthotto, Lars; Au Yong, Nicholas (December 2023, 3rd Machine Learning for Health Symposium)

We present a neural network approach for closed-loop deep brain stimulation (DBS). We cast the problem of finding an optimal neurostimulation strategy as a control problem. In this setting, control policies aim to optimize therapeutic outcomes by tailoring the parameters of a DBS system, typically via electrical stimulation, in real time based on the patient’s ongoing neuronal activity. We approximate the value function offline using a neural network to enable generating controls (stimuli) in real time via the feedback form. The neuronal activity is characterized by a nonlinear, stiff system of differential equations as dictated by the Hodgkin-Huxley model. Our training process leverages the relationship between Pontryagin’s maximum principle and Hamilton-Jacobi-Bellman equations to update the value function estimates simultaneously. Our numerical experiments illustrate the accuracy of our approach for out-of-distribution samples and the robustness to moderate shocks and disturbances in the system.
Full Text Available
Learning Control Policies of Hodgkin-Huxley Neuronal Dynamics

Madondo, Malvern; Verma, Deepanshu; Ruthotto, Lars; Au Yong, Nicholas (November 2023, ML4Health Findings Track Collection)

Full Text Available
